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Abstract 

Deterministic  Galerkin  approximations  of  a  class  of  second  order  elliptic  PDEs  with 
random  coefficients  on  a  bounded  domain  D  C  are  introduced  and  their  convergence 
rates  are  estimated.  The  approximations  are  based  on  expansions  of  the  random  diffusion 
coefficients  in  L2(D)-orthogonal  bases,  and  on  viewing  the  coefficients  of  these  expansions  as 
random  parameters  y  =  y(u>)  =  (yi(u>)).  This  yields  an  equivalent  parametric  deterministic 
PDE  whose  solution  u(x,  y)  is  a  function  of  both  the  space  variable  x  £  D  and  the  in  general 
countably  many  parameters  y. 

We  establish  new  regularity  theorems  decribing  the  smoothness  properties  of  the  solution 
u  as  a  map  from  y  €  U  =  (—1, 1)°°  to  V  =  Hq(D).  These  results  lead  to  analytic  estimates 
on  the  V  norms  of  the  coefficients  (which  are  functions  of  x)  in  a  so-called  “generalized 
polynomial  chaos”  (gpc)  expansion  of  u. 

Convergence  estimates  of  approximations  of  u  by  best  TV-term  truncated  V-valued  poly¬ 
nomials  in  the  variable  y  £  U  are  established.  These  estimates  are  of  the  form  N~r,  where 
the  rate  of  convergence  r  depends  only  on  the  decay  of  the  random  input  expansion.  It 
is  shown  that  r  exceeds  the  benchmark  rate  1/2  afforded  by  Monte-Carlo  simulations  with 
TV  “samples”  (i.e.  deterministic  solves)  under  mild  smoothness  conditions  on  the  random 
diffusion  coefficients. 

A  class  of  fully  discrete  approximations  is  obtained  by  Galerkin  approximation  from  a 
hierarchic  family  {V)}/20  C  V  of  finite  element  spaces  in  D  of  the  coefficients  in  the  TV- 
term  truncated  gpc  expansions  of  u(x,y).  In  contrast  to  previous  works,  the  level  l  of 
spatial  resolution  is  adapted  to  the  gpc  coefficient.  New  regularity  theorems  decribing  the 
smoothness  properties  of  the  solution  u  as  a  map  from  y  £  U  =  (—1, 1)°°  to  a  smoothness 
space  W  C  V  are  established  leading  to  analytic  estimates  on  the  W  norms  of  the  gpc 
coefficients  and  on  their  space  discretization  error.  The  space  W  coincides  with  H2(D )  fl 
Hq(D)  in  the  case  where  I?  is  a  smooth  or  convex  domain. 

Our  analysis  shows  that  in  realistic  settings  a  convergence  rate  TV^/  ^  in  terms  of  the 
total  number  of  degrees  of  freedom  Nd.o.f  can  be  obtained.  Here  the  rate  s  is  determined 
by  both  the  best  TV-term  approximation  rate  r  and  the  approximation  order  of  the  space 
discretization  in  D. 
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1  Introduction 

Partial  differential  equations  with  stochastic  input  data  are  a  commonly  used  paradigm  in  sci¬ 
ence  and  engineering.  Stochasticity  typically  reflects  the  uncertainty  in  the  various  parameters 
entering  the  physical  phenomenon  described  by  the  equation.  A  simple  yet  relevant  model 
problem  is  the  elliptic  equation 

-V  •  (aV'«)  =  /  in  D,  u\dD  =  0,  (1.1) 

in  a  bounded  Lipschitz  domain  D  C  Rrf.  We  assume  that  /  =  f(x )  is  a  given  deterministic 
function  in  L2(D)  and  that  the  diffusion  coefficient  a  in  (1.1)  is  a  random  field  on  a  probability 
space  (0,S,P)  over  L°°(D)  (see,  e.g.,  [11]).  In  particular,  given  any  ip  £  L2(D)  and  any  Borel 
subset  A  of  M,  the  set  {iu  £  fl  :  (a(-,  cu),  ip)  £  A}  £  S  where  (-,-)l2(D)  denotes  the  L2(D) 
innerproduct  and  lo  £  fl  represents  a  draw  of  this  field  with  respect  to  the  probability  P. 

In  this  model,  stochasticity  is  therefore  used  to  describe  the  uncertainty  in  the  diffusion 
coefficient  a.  In  order  to  ensure  uniform  ellipticity,  we  make  the  following  assumption. 

Assumption  1.  There  exist  constants  0  <  am;n  <  amax  such  that 

®min  i;  Qi(x,  Lo)  <  Umax;  (1.2) 


holds  for  all  (x,lo)  £  D  x  fi. 


By  the  Lax-Milgram  lemma,  this  assumption  immediately  implies  for  every  to  £  fl  the  exis¬ 
tence  of  a  solution  u(-,co)  in  the  space  Hq(D),  in  the  sense  of  the  variational  formulation: 

J  a(x,  w)Vm(x,  lu)  ■  Vv(x)dx  =  J  f(x)v(x)dx,  for  all  v  £  Hq(D),  (1-3) 


D  D 

where  the  gradient  V  is  taken  with  respect  to  the  x  variable.  This  solution  satisfies  the  estimate 

\\u{-,  uj)  ||y  <  B  :=  Mill —  for  all  c u  £  fl.  (1-4) 

ttmin 

Here,  and  in  all  the  following,  we  denote  by  V  the  space  Hq(D),  equipped  with  the  energy  norm 
IMIv  :=  HVvIUap)  and  by  V*  its  dual  H^1(D).  The  estimate  (1.4)  also  reads 

sup  ||t/(-,u;)||y  <  B.  (1.5) 
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The  solution  u  =  u(x,  u)  is  a  random  field  associated  to  the  probability  space  (0,  £,  P).  Numer¬ 
ical  methods  have  been  developed  in  order  to  approximately  compute  quantities  which  describe 
the  probabilistic  behaviour  of  the  field  u.  Such  quantities  are  typically  the  statistical  moments 
of  u,  such  as 

(i)  the  mean  field  u  which  is  defined  as  (formal)  “ensemble  average” 

u  :=  E (u)  =  j  u{-,u)dP(uS)  G  V  , 
n 


(ii)  the  covariance  function 

Cu  :=  E ([it  —  u)]  <g>  [it  —  «])  G  V  <g>  V  . 

Since  u  is  in  general  not  a  Gaussian  process,  u  and  Cu  only  give  partial  information  on  the 
probability  distribution  of  u.  We  distinguish  two  general  numerical  approaches  for  computing 
such  quantities:  Monte-Carlo  (MC)  methods  and  deterministic  methods. 


Monte-Carlo  methods:  These  are  based  on  N  independent  draws  {ai,  •  •  • ,  a at}  of  the  random 
coefficient  a.  For  each  instance  a*,  they  compute  the  solution  ut  to  the  equation  — V-(ajVrq)  =  / 
and  use  the  resulting  sample  {u\,  •  •  • ,  un}  to  estimate  the  quantities  of  interest.  For  example, 
the  mean  field  u  is  approximated  by 


UN  ■  = 


i=  1 


(1.6) 


The  fact  that  the  Ui  are  independent  and  their  laws  are  identical  to  the  law  of  u  implies 

E(ll«-^iv||y)  =  ^E(||u  —  u\\y) 

and,  since  E(||it  —  u\\y)  <  E(||u||y),  we  obtain  with  the  Cauchy-Schwarz  inequality, 

E(||u  -  uN\\v)  <  (EiWufy^N-^  (1.7) 


i.e.  Monte-Carlo  approximations  with  N  samples  converge  with  rate  1/2  provided  that  the 
solution  u  as  a  F-valued  random  function  has  finite  second  moments.  If  u  has  lower  summability, 
lower  convergence  rates  for  the  MC  approximation  (1.6)  will  result:  interpolating  (1.7)  with  the 
straightforward  bound  E(||u  —  un\\v)  <  2E(||u||y)  implies  the  reduced  rate  N~r  with  r  = 
1  —  1/q  G  [0, 1/2]  provided  E(||rt||y)  <  oo  for  some  q  G  [1,2]  (see  e.g.  [14]).  Better  summability 
of  u  in  the  lo  variable,  such  as  e.g.  (1.5),  however,  does  not  generally  allow  to  improve  the 
convergence  rate  of  the  MC  approximation  (1.6)  beyond  r  =  1/2. 

In  practice,  the  Ui  in  (1.6)  are  computed  approximately  by  space  discretization,  for  example 
by  the  finite  element  method.  The  computable  approximation  to  u  is  thus  given  by 


UN,h  := 


1 

N 


N 

'y  ]  ui,hi 

i— 1 
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where  Ui.h  is  the  Galerkin  approximation  to  ut  in  the  finite  element  space  V/  which  need  to  be 
chosen  such  that  the  corresponding  discretization  error  does  not  affect  the  MC  rate  (1.7).  The 
total  complexity  of  the  solution  process  is  thus  at  least  of  order  O(NM)  where  M  =  dim(V/). 

Leaving  aside  the  space  discretization  aspects,  we  note  the  following  advantages  and  draw¬ 
backs  of  the  MC  approximation  (1.6): 

•  On  the  positive  side,  the  computations  of  the  ut  are  independent  from  each  other  and 
can  be  performed  in  a  parallel  fashion.  Observe  also  that  MC  is  a  statistical  inference 
approach:  it  does  not  require  the  full  knowledge  of  the  joint  probability  law  of  the  field 
a,  but  only  a  sample  of  independent  instances.  If,  however,  these  instances  are  computer 
generated  (as,  e.g.,  in  numerical  simulations),  the  “simulator”  necessarily  contains  the  law 
of  a  in  some  (parametric)  form. 

•  On  the  negative  side,  the  convergence  estimate  (1.7)  is  only  in  an  expectation  sense  (al¬ 
though  u  is  a  deterministic  function). 

The  convergence  rate  (1.7)  of  1/2  for  the  Monte-Carlo  approximation  (1.6)  can  not  be 
improved,  in  general,  despite  the  fact  that  the  solution  u  depends  smoothly  on  a. 

Deterministic  methods:  These  have  been  studied  for  several  decades  (see  [8]  and  the  refer¬ 
ences  therein).  In  contrast  to  MC,  these  methods  take  advantage  of  the  smooth  dependence  of 
u  on  a.  We  distinguish  two  general  classes  of  deterministic  methods. 

The  perturbation  approach  is  based  on  the  Neumann  expansion  of  the  stochastic  solution 
around  its  mean  field,  and  successive  computations  of  the  terms  in  this  expansion  (see  [10]  and 
the  references  therein).  Such  methods  are  computationally  efficient  for  the  first  terms,  i.e.  the 
low  order  moments  of  the  solution,  yet  grow  in  complexity  for  higher  order  terms. 

The  spectral  approach  is  based  on  the  so-called  Wiener/generalized  polynomial  chaos  expan¬ 
sion  introduced  in  [22]  (see  also  [9]  and  [15]).  The  first  step  consists  in  representing  a  by  a 
sequence  of  scalar  random  variables  (yj)j> i,  usually  obtained  through  a  decomposition  of  the 
oscillation  a  —  a  into  an  orthogonal  basis  (ipj)j>i  of  L2(D): 

a(x,tu)  =  a(x)  +  ’^^yj(u)ipj(x).  (1.8) 

3>  i 

The  solution  is  now  viewed  as  a  function  u(x,  y)  where  x  £  D  is  the  space  variable  and  y  =  (yj)j> i 
is  a  vector  of  “stochastic  variables” ,  and  the  objective  is  to  compute  a  numerical  approximation 
to  u(x,y).  This  approach  provides  an  approximation  of  the  probability  law  of  the  solution 
and  therefore  gives  access  to  virtually  all  possible  information  on  its  probabilistic  behaviour. 
However,  one  is  facing  a  problem  of  high  -  possibly  infinite  -  dimension,  due  to  the  number  of 
coordinates  in  the  y  variable  refers  to  the  approximation  of  the  solution  in  this  variable  using 
tensor  product  polynomials. 

The  numerical  analysis  of  the  spectral  approach  began  only  recently.  When  the  vector  y 
has  finite  dimension  K ,  the  error  between  u(x,  y)  and  its  approximation  were  shown  in  [1]  to 
decrease  exponentially  with  respect  to  the  polynomial  degree  of  the  approximation.  However, 
the  derived  estimates  depend  heavily  on  K  as  K  grows.  Since  y  is  usually  of  infinite  dimension, 
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one  needs  to  incorporate  the  effect  of  its  truncation  to  a  finite  set  of  K  variables  in  the  error 
analysis,  and  K  has  to  grow  to  +oo  in  the  convergence  analysis.  There  is  therefore  a  crucial 
need  for  convergence  estimates  which  are  independent  of  K.  Such  estimates  were  established  for 
the  first  time  in  [21],  where  exponential  convergence  rates  independent  of  K  were  proved  under 
the  assumption  that  the  terms  in  the  expansion  (1.8)  decay  exponentially  to  0  in  the  L°°  norm. 
Here,  the  authors  specialized  on  the  Karhunen-Loeve  (KL)  expansion  for  (1.8),  and  the  desired 
decay  property  was  obtained  under  the  (strong)  assumption  that  the  covariance  function 

Ca(x,  y )  :=  E([a(x)  -  a(x)][a(y)  -  a(y)])  (1.9) 

has  analytic  smoothness  in  x  and  y. 

In  the  present  paper,  we  explore  the  more  realistic  setting  where  the  terms  in  (1.8)  only 
have  algebraic  decay.  Our  main  objective  is  to  design  an  approximation  scheme  which  converges 
with  rate  r  >  1/2  under  such  realistic  assumptions  on  the  random  input.  Our  analysis  is  not 
restricted  to  the  KL  expansion  and  will  therefore  be  carried  out  for  general  expansions  of  a  in  an 
orthogonal  basis  (ipj).  We  shall  analyze  the  dependence  of  the  solution  u(x,  y)  on  the  parameters 
y  and  thereby  show  that  u  has  an  expansion  into  a  polynomial  basis  with  coefficients  from  V. 
By  deriving  a  priori  bounds  on  the  decay  of  the  coefficients  of  u  in  such  an  expansion,  we  shall 
derive  algebraic  rates  of  convergence  for  the  spectral  approach  under  rather  mild  assumptions 
on  the  smoothness  of  a.  Our  analysis  is  independent  of  the  number  K  of  retained  variables  and 
depends  only  on  the  rate  of  decay  of  the  terms  in  (1.8).  A  key  feature  in  our  analysis  lies  in  the 
choice  of  a  particular  sparse  tensor  product  polynomial  space,  which  can  be  interpreted  as  a  form 
of  non-linear  best  IV-term  approximation.  Let  us  mention  that  other  strategies  for  selecting  the 
sparse  polynomial  spaces  in  the  variable  y  G  U  have  been  proposed  and  investigated  recently 
in  [12,  13]  based  on  the  concept  of  sparse  grid  introduced  in  [19].  A  specific  feature  of  our 
approach,  compared  to  these  strategies,  lies  in  the  optimal  choice  of  the  polynomial  space  which 
allows  us  to  relate  the  convergence  rate  r  to  the  rate  of  decay  of  the  random  input  expansion. 
Another  contrast  with  previous  works  is  that  we  adapt  the  level  of  spatial  discretization  to  each 
gpc  coefficient,  which  is  essential  in  order  to  obtain  an  optimal  overall  convergence  rate  in  terms 
of  the  total  number  of  degrees  of  freedom. 

Our  paper  is  organized  as  follows.  We  discuss  in  §2  the  general  properties  of  the  stochastic 
expansion  (1.8)  and  introduce  the  corresponding  parametric  PDE  induced  by  (1.1).  This  para¬ 
metric  problem  is  defined  for  parameters  y  £  U  where  U  is  the  set  of  all  sequences  (ijj)  with 
| yj 1  <  1.  In  §3,  we  introduce  the  measures  and  spaces  defined  on  U  which  are  the  setting  for  our 
approach.  This  is  followed  in  §4  by  deriving  bounds  on  the  partial  derivatives  of  u(x,  y)  with 
respect  to  the  variables  y3 .  A  general  Galerkin  scheme  for  the  approximation  of  u(x,y)  in  the 
y  variable  is  proposed  in  §5,  based  on  a  sparse  set  of  tensorized  Legendre  polynomials.  In  order 
to  study  the  convergence  of  this  scheme,  we  investigate  in  §6  bounds  on  the  exact  Legendre 
coefficients  in  the  expansion  of  u.  These  estimates  are  used  in  order  to  derive  the  convergence 
rate  of  the  Galerkin  scheme,  through  several  key  results  on  the  summability  of  multi-indexed 
sequences  which  are  established  in  §7.  Finally  we  discuss  in  §8  the  full  discretization  in  the  x 
and  y  variables  and  make  a  final  comparison  with  MC  methods.  Conclusions  and  perspectives 
are  raised  in  the  final  section. 
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We  emphasize  that  the  results  of  this  paper  show  that  solutions  to  stochastic  and  parametric 
equations  of  the  above  type  possess  enough  regularity  to  be  well  approximated  by  Galerkin 
subspaces  of  suitable  dimension.  The  problem  of  how  to  numerically  find  these  subspaces  is  not 
treated  although  we  make  some  remarks  on  this  interesting  problem  in  the  concluding  section 
of  this  paper. 


2  Basis  expansions  of  the  coefficient  a 

The  present  paper  is  based  on  the  spectral  approach  which  we  recall  begins  by  decomposing 
the  random  field  a  into  an  expansion  of  the  type  (1.8).  We  assume  throughout  this  paper  that 
(ipj)j>i  is  a  complete  orthogonal  sequence  in  L2(D)  (we  could  work  more  generally  with  any 
Riesz  basis  of  L2(D)).  Based  on  our  assumptions  on  the  random  field  a,  the  random  variables 

Vj  ■=  WjWfZ  j (a  —  a)ipj,  j  =  1,2, ... 

D 

are  P-measurable  functions. 

We  next  introduce  assumptions  concerning  the  convergence  of  the  expansion  (1.8)  in  the 
L°°(D )  norm.  These  assumptions  are  formulated  in  terms  of  the  summability  properties  of  the 
sequence  (\\yjipj\\L°°(D))j>i-  Up  to  a  renormalization  of  the  basis  functions  'ipj,  we  may  assume 
without  loss  of  generality  that  for  all  j  >  1  the  random  variables  yj  are  such  that  ||2/j||Loo(o)  =  1- 
Up  to  a  change  of  the  definition  of  a  on  a  set  of  measure  zero  in  P  this  is  equivalent  to 


sup  \yj{w)\  =  1.  (2.1) 

uj(E.£2 

The  vector  y  is  thus  supported  in  the  infinite  dimensional  cube 

U:=  [-l,lf, 

i.e.  the  unit  ball  of  £°°(N).  With  such  a  normalization,  our  assumptions  are  formulated  on  the 
sequence  (\\'f’j\\Loc(D))j>i-  Our  first  assumption  is  a  strengthening  of  Assumption  1. 


Assumption  2.  The  functions  a  and  ipj  satisfy 

^  ]  HVb  \\l°°(D)  ^  f—j—f  ®min i 
3>  1  K 

with  drain  ■=  minxeD  a(x)  >  0  and  n  >  0. 


(2.2) 


In  the  convergence  results  of  §7,  a  prescribed  value  of  the  constant  k  will  be  needed.  We 
can  view  Assumption  2  as  a  strong  ellipticity  assumption  on  a  which  requires  that  the  relative 
perturbation  of  a  by  the  series  Vji’j  U  n°t  t°°  large.  Clearly,  it  implies  Assumption  1  with 

_  k  _  1  _ 

®min  • —  ®min  Z  ;  ®min  —  Z  ;  Omin  P  0-  (2.3) 

1  +  K  1  +  K 
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Since  j^omin  =  fcamjn.  Assumption  2  also  implies 

x;  n^'ii  L°°(D)  <K.  (2.4) 

. .  ®min 

■7>1 

In  particular,  Assumption  2  and  the  value  of  k  is  independent  of  the  scaling  of  a. 

In  order  to  obtain  convergence  rates  r  >  !2  for  our  approximation  scheme,  additional  summa- 
bility  properties  are  needed  as  expressed  by  the  following  assumption. 

Assumption  3.  The  sequence  (H^H^oo^))  belongs  to  ^P(N)  for  some  p  <  1: 

I]  Wj\\PL°o(D)  <  +°° 

3>  1 


We  next  discuss  possible  choices  for  the  basis  ('fj)j>i-  Since  the  main  objective  is  to  describe  ac¬ 
curately  the  diffusion  coefficient  a  with  as  few  parameters  yj  as  possible  and  to  fullfill  the  above 
summability  assumptions,  this  choice  should  be  tied  to  the  properties  of  this  random  field.  On 
the  other  hand  this  basis  will  enter  the  computation  of  the  solution  and  should  therefore  be  easy 
to  manipulate  numerically. 

An  important  example  is  the  Karhunen-Loeve  basis  of  the  L2(D)-orthogonal  eigenfunctions 
of  the  covariance  integral  operator 

g  Tg(x )  :=  J  Ca(x,y)g(y)dy, 

D 

where  Ca  is  the  covariance  function  (1.9).  We  index  these  eigenfunctions  in  decreasing  order 
of  the  corresponding  eigenvalues.  These  functions  are  well  defined  for  any  domain  D.  In  the 
particular  case  where  D  is  a  fundamental  period,  D  =  [0,  l]d  say,  and  a  is  a  stationary  and 
.D-periodic  random  field,  i.e.  its  covariance  function  has  the  form  Ca(x,  y)  =  A(x  —  y)  where 
A  is  D-periodic,  then  T  is  a  convolution  operator  and  the  KL  basis  is  the  Fourier  basis.  In 
general,  the  KL  expansion  has  properties  which  emulate  those  of  Fourier  series.  In  particular, 
the  decay  of  the  terms  and  the  rate  of  convergence  of  the  KL  expansion  are  dictated  by  the 
average  regularity  of  the  field  a  measured  by  the  smoothness  properties  of  Ca.  We  refer  to  [21] 
for  such  results  when  Ca  is  analytic  and  to  [20]  for  similar  results  with  less  regular  kernels. 

In  the  case  of  a  one-dinrensional  Fourier  expansion, 

a(x,  t u)  =  a(x)  +  ^2  c u)e*27rfcr, 

kez 

it  is  known  that  if  the  function  a(-,uj)  —  a  is  in  Lip(s,L1)  for  some  s  >  1,  then  its  Fourier 
coefficients  satisfy  the  decay  estimate 

\a(k,co)\  <  C\k\~s,  \k\  >  1, 
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with  C  depending  on  the  Lip(s,  L1)-norm  of  a{-,ui)  —  a.  Assuming  that  this  norm  is  bounded 
independently  of  u  and  reindexing  the  expansion  on  j  >  1  with  the  normalization  (2.1),  we  thus 
obtain 

Uj\\Loo{D)<Cj-s ,  j  =  1,2, ...  (2.5) 

Therefore  £p  summability  of  the  sequence  (\\ipj\\Loo(D))j>i  is  ensured  when  s  >  j-.  In  the  multi¬ 
variate  case,  the  rate  of  decay  (2.5)  is  modified  to  j~s/d.  In  summary,  Assumption  2  and  3  can 
be  derived  from  the  smoothness  properties  on  the  field  a. 

In  the  nonperiodic  case,  the  KL  eigenfunctions  are  in  general  not  analytically  available. 
However,  approximations  of  them  can  be  computed  efficiently  numerically;  see,  e.g.  [18]  for 
algorithms  based  on  fast  multipole  approximations  of  covariance  operators  T  when  Ca(x,  y )  is 
analytic  for  x  ^  y.  There  exist,  however,  many  other  basis  expansions  for  which  similar  decay 
properties  hold  when  a  has  some  smoothness,  in  particular  wavelet  expansions  which  can  be 
constructed  on  fairly  arbitrary  domains,  see  [4]  for  a  general  treatment.  One  can  carry  out  for 
wavelet  bases,  the  same  analysis  as  described  above  for  the  Fourier  basis.  For  example,  in  the 
univariate  case,  the  decay  rate  (2.5)  is  now  satisfied  if  a(-,  u)  —  a  belong  to  the  Holder  space  Cs 
with  their  C^-norm  bounded  independently  of  u. 

In  summary,  the  basis  V’j  should  be  taken  with  an  eye  towards  two  issues.  The  first  is  that  it 
should  be  easy  to  manipulate  numerically.  The  second  is  that  the  infinite  dimensional  vector  y 
has  components  yj  which  decrease  rapidly  as  j  grows,  with  the  rate  of  decay  being  determined 
by  the  smoothness  of  the  field  a. 

For  each  y  6  U,  we  define 

a  =  a(x,y)  :=a  +  ^2yjipj(x),  x£D,y£U.  (2.6) 

j>  i 

Because  of  Assumption  2  ,  the  series  (2.6)  converges  absolutely  and  uniformly  on  D  x  U. 
Notice  that  a(x,  y)  is  defined  for  all  y  €  U  and  not  just  for  the  y(uj)  which  are  the  image  of  some 
lo  £  H.  In  particular,  we  have 

a(x,y)  >  amin,  (2.7) 

for  all  y  G  U  with  amin  defined  by  (2.3).  In  the  sequel,  we  will  use  a  to  denote  both  a(x,  u) 
and  o(x,  y)  but  which  of  these  is  being  employed  will  be  clear  from  the  context.  Similarly  y  will 
denote  both  the  stochastic  basis  coefficients  y( uj)  as  well  as  a  general  point  in  the  parameter 
space  U. 

3  Probability  spaces  on  U 

Since  U  is  an  infinite  product  of  the  intervals  [—1,1]  some  care  must  be  taken  in  defining 
probability  measures  on  U.  We  shall  have  need  for  two  measures.  The  first  of  these  is  the 
infinite  tensor  product  measure  dy  of  the  univariate  uniform  probability  measures  on  [—1,1]: 


dy(y)  =  ®j>\dyj/2. 


(3.1) 


Recall  that  the  sigma  algebra  for  dp  is  generated  by  the  finite  rectangles  njli  <S 'j,  where  only 
a  finite  number  of  the  Sj  are  different  from  [—1,1]  and  those  that  are  different  are  intervals 
contained  in  [—1, 1],  Then  (U.  Q,dp ;)  is  a  probability  space. 

We  shall  also  need  a  measure  p  defined  on  U  which  is  induced  by  the  mapping  u ;  — *  y(u>). 
This  measure  is  defined  on  the  same  sigma  algebra  0  as  for  the  uniform  measure  discussed 
above.  Consider  any  finite  rectangle  <S)^L i  Sj ,  where  Sj  =  [—1,1]  for  all  j  >  n  for  some  n.  We 
define 

n 

p(S):=l[P{u:yj(u)€Sj}.  (3.2) 

3= 1 

Then,  p  extends  to  a  measure  defined  on  all  sets  in  the  sigma  algebra  0.  This  gives  the  measure 
space  ( U,Q,p ).  Given  these  measure  spaces,  we  introduce  for  1  <  p  <  oo  the  Banach  spaces 
Lp(U,dp ',)  and  Lp(U,dp).  For  the  (separable)  Hilbert  space  V,  we  denote  by  LP(U,V,  dp)  and 
LP(U ,  V,  dp)  the  corresponding  Bochner  spaces  of  p-summable  mappings  from  U  to  V,  equipped 
with  their  corresponding  norms.  For  example, 

Ml *(u,v,dp)  '■=  j  \\v(^y)\\vMy)  =  j (  j \Vv(x,y)\2dx^Jdp{y).  (3.3) 

U  U  D 

Here  and  in  the  following,  V  is  understood  to  be  applied  in  the  x  variable. 

We  shall  also  need  certain  orthogonal  bases,  built  from  Legendre  polynomials,  for  some  of 
these  spaces.  Let  (Ln)n>o  be  the  univariate  Legendre  polynomials  normalized  according  to 

i 

j  |L„(t)|2f  =  1.  (3.4) 

-1 

or  equivalently 

max  \Ln(t)\  =  \/2n  +  l.  (3-5) 

*6[— 1,1] 

Recall  that  Ln  is  an  algebraic  polynomial  of  degree  n  and  the  family  (Ln)n> q  is  a  complete 
orthogonal  system  for  L2[— 1,1]. 

We  introduce  the  countable  set  T  of  all  sequences  v  =  {vj)j> i  of  nonnegative  integers  such 
that  only  finitely  many  Vj  are  non-zero.  For  all  v  E  T,  we  use  the  notation 

M  :=  =  IWh1’ 

3>  1 

and 

v\  =  rj  Vj ! ,  v  €  T . 

3>  1 

We  define  the  tensorized  Legendre  polynomials  by 

Lu{y)  =  YlLuj{yj).  (3.6) 

j>i 

By  construction,  the  family  {Lv)u&jr  forms  an  orthonormal  system  in  L2(U,  dp).  Since  Lo(t)  =  1, 
and  any  v  £  T  has  only  a  finite  number  of  nonzero  entries,  the  function  Lu(y)  only  depends  on 
finitely  many  y3  (namely  those  j  such  that  Uj  ^  0). 
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The  family  {Lv)v&jr  is  easily  seen  to  be  complete  in  L2(U,  d/i).  Indeed,  any  function  in 
L2(U,  dp)  can  be  approximated  to  any  given  tolerance  by  a  finite  linear  combination  of  charac¬ 
teristic  functions  of  finite  rectangles  and  each  characteristic  function  of  a  finite  rectangle  can  be 
approximated  by  polynomials  to  any  prescribed  accuracy.  Therefore  (Lu)u^fr  is  an  orthonormal 
basis  of  L2(U,  dp).  In  turn,  each  v  £  L2(U,  V ,  dp)  has  a  representation 


v  =  y;  vvLv,  where  vu=  g(-,y)Lu(y)dp(y )  £  V 


and  |M| L2{uy4fl)  =  || (|K || 


(3.7) 


4  Parametric  expansion  of  u 

Suppose  that  we  have  an  orthogonal  system  (ifj)j> o  such  that  Assumption  2  holds  and  a  is 
defined  by  (2.6).  We  denote  by  u(x,y )  the  solution  to 

-V  •  (aVu)  =  f  in  D,  u\dD  =  0,  (4.1) 

where  D  C  is  the  Lipschitz  domain  introduced  earlier.  Since  Assumption  2  implies  the  lower 
bound  (2.7),  the  equations  (4.1)  are  uniformly  elliptic  in  y  £  U,  and  we  have 

INI L°°{U,V)  =  IMI L°°(U,V-,dfi)  :=  SUP  \\u(-iy)\\v  <  B  (4-2) 

y&U 

with  B  as  in  (1.4).  Throughout  in  what  follows,  the  expression  L°°(U,V )  shall  be  understood 
in  the  sense  (4.2),  also  for  different  choices  of  the  space  V. 

In  this  section,  we  fix  /  and  examine  the  smoothness  of  u  as  a  function  of  the  parameter 
vector  y  £  U.  We  shall  establish  generic  a-priori  bounds  for  \\dyU\\Loo(uyy  These  bounds  could 
possibly  be  improved  when  working  with  a  specific  orthogonal  system  (V;j)j>i  such  as  a  wavelet 
system.  However,  at  this  stage  we  do  not  want  to  complicate  the  presentation  by  pursuing  such 
avenues. 

As  a  first  step,  we  prove  the  existence  in  V  of  the  partial  derivatives  dyU  at  any  y  £  U.  For 
this  purpose,  we  need  the  following  stability  result. 

Lemma  4.1  If  u  and  u  are  solutions  of  (1.3)  with  the  same  right  hand  side  f  and  with  coeffi¬ 
cients  a  and  a,  respectively,  and  if  these  coefficients  both  satisfy  the  assumption  (1.2)  with  the 
same  lower  bound  am;n,  then 

ii  ~n  /  \\f\\v*  I,  ~n  /A 

||ti  —  U\\v  S  — 2 - lla  ~  all L°°(D)-  (4-3) 

amin 

Proof:  Substracting  the  variational  formulations  (1.3)  for  u  and  u,  we  find  that  for  all  v  £  V, 

dVu  •  Vv  =  j  a(Vn  —  Vu)  ■  Vv  +  J (o  —  a)Vu  ■  Vv.  (4.4) 

D  D  D  D 
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Theorem  4.2  At  any  y  £  U,  the  function  y  e- >  u(y)  admits  a  partial  derivative  dyu(y )  £  V  for 
any  v  £  T . 

Proof:  We  start  by  proving  the  existence  of  the  first  order  derivative  dy.u(y)  for  any  j  >  1  and 
y  £  U.  We  denote  by  the  Kronecker  sequence  with  1  at  index  j  and  0  at  other  indices.  For 
h  £  R  \  {0}  we  consider  the  difference  quotient 

My)  =  u(y  +  hcj)-u(y)^  (4.5) 

We  notice  that  this  quotient  is  well  defined  for  h  small  enough:  if  I  ^  1 1 1  Vi?  1 1 L00  (D)  <  we 

clearly  have  for  all  y  £  U, 

^  <  a(x,  y  +  he,)  <  amax  +  ^,  x  £  D, 

and  therefore  u(y  +  hef)  is  well  defined  as  an  element  of  V.  For  such  a  small  enough  h,  we  have 
for  all  v  £  V, 

0  =  f  a(x,  y  +  hej)Vu(x,  y  +  hej)  ■  Vv(x)dx  —  f  a(x,  y)Vu(x,  y)  ■  Vv(x)dx 

D  D 

=  h  f  a(x,  y)Vwh(x ,  y )  •  \7v(x)dx  +  f  (a(x,  y  +  hej )  —  a(x,  y))Vu(x,  y  +  hej)  ■  \7v(x)dx 

D  D 

=  hf  a(x,  y)S7wh(x,  y)  ■  \7v(x)dx  +  h  f  ifj{x)Vu{ x,  y  +  hej)  ■  \7v(x)dx 

D  D 

and  therefore  vjf-Jy)  is  the  unique  solution  to 


J  a(x,y)'Vwh{x,y)  ■  Vv(x)dx  =  Lh(v),  for  all  v  £  V, 

D 

where  :  v  — >  Lh(v)  :=  —  f  ifj(x)Vu(x,y  +  hej)  ■  \7v(x)dx  is  a  continuous,  linear  functional 

D 

on  V.  The  linear  functional  L/l(-)  varies  continuously  in  V*  with  h  as  h  tends  to  0:  indeed,  we 
have  for  all  v  £  V, 


I Lh{v)  -  Lq(v)\  =  |  J  ipj(Vu(y  +  hej)  -  Vu(y))  ■  Vu|  <  ||V’j||z,«>(D)IKy  +  hej)  -  u(y)\\v\\v\\v, 

D 

and  since  the  stability  estimate  (4.3)  implies 

II  u(y  +  hej)  -  u(y)\\v  =  ||  Vu(y  +  hej)  -  Vu{y)  ||L2(D)  <  \h\ \\ifj  ||l°°(d)  , 

Qmin 
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it  follows  that  Lh  converges  towards  Lq  in  V*  as  h  — >  0.  Therefore  Wh  converges  in  V  towards 
tco,  which  is  the  solution  to 

J  a(x ,  y)'Vwo(x)  ■  Vv(x)dx  =  Lq(v),  for  all  v  G  V. 

D 

Hence  dyju(y)  =  wo  exists  in  V  and  is  the  unique  solution  of  the  variational  problem 

J  a(x,  y)Vdyju(x,  y)^Jv(x)dx  =  —  j  ipj(x)'Vu(x,y)'Vv(x)dx,  for  all  v  G  V.  (4.6) 


D 


D 


We  notice  that  this  problem  is  obtained  by  formally  differentiating  the  variational  formulation: 
given  /  G  V*  and  y  G  t/,  find  u(-,y)  G  V  such  that 

J  a{x,  y)Vu(x,  y)Vv(x)dx  =  j  f(x)v(x)dx  ,  for  all  v  G  V  (4-7) 


D 


D 


with  respect  to  the  variable  yj.  By  the  same  reasoning,  we  can  inductively  derive  the  existence 
in  V  of  higher  order  derivatives  dyu{y).  These  derivatives  are  solutions  of  variational  problems 
obtained  by  further  differentiating  (4.7).  o 

We  now  estimate  the  norms  \\dyU\\Loo(jjy^,  For  this  purpose,  we  introduce  the  following 
notation.  If  a  =  (otj)j>i  is  a  sequence  of  positive  numbers,  we  define  for  all  u  G  T 

n"  :=Ua7- 

j>  i 

We  also  use  the  following  sequence  b  throughout  the  remainder  of  this  paper: 

""M  L°°(D) 


b=(bj)?=1,  bj:= 

o-min 

Theorem  4.3  With  the  constant  B  as  in  (1-4),  we  have 

\\dyuh°°(u,V)  <  B\v\\bu  Mo  G  7. 


(4.8) 


(4-9) 


Proof:  As  a  first  step,  we  study  the  variational  problems  satisfied  by  the  partial  derivatives 
dy(y).  We  claim  that  these  problems  have  the  general  recursive  form 

/  «(*,  »)V^x, ») v«(x)d,  =  -  E  /  *(x) var>«(»,  »)V«(x)«fc.  (4.10) 


{J-  ^¥=0}  D 

where  ej  is  again  the  Kronecker  sequence  with  value  1  at  position  j  and  0  elsewhere. 

We  prove  (4.10)  by  induction  on  \v\.  When  \v\  =  1  this  is  (4.6).  For  \v\  >  1,  let  k  be  any 
index  such  that  7^  0,  we  define  v  =  v  —  which  satisfies  \v\  =  \v\  —  1.  By  the  induction 
hypothesis,  we  have  for  all  v  G  V 

j  a(x,  y)Vd*u(x,  y)Vv{x)dx  +  ^  f  ^(x)VS^-eiu(a;,  y)Vv{x)dx  =  0, 

d  1  f-  *i¥0}  D 
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where  Vj  =  Vj  if  j  ^  k  and  fg,  =  v\~  —  1.  Differentiating  with  respect  to  ijk,  we  obtain 
0  =  j  a(x,  y)VdyU(x,  y)Vv(x)dx  +  J  ipk{x)\/dy~eku(x,  y)Vv(x)dx 

D  D 

+  ^  Vj  /  ,tpj(x)'Vdy  ej u(x,y)'Vv(x)dx  +  (vk  —  1)  /  ipk(x)Xdy~eku(x,y)'Vv(x)dx, 

r  .•  / 1.  . .  /  J  J 


{j^k:  0}  D 


D 


which  is  equivalent  to  (4.10).  Selecting  in  (4.10)  the  function  v(x)  =  8yU(x,y)  €  V,  and  using 
both  ellipticity  and  continuity  of  the  bilinear  form,  we  obtain 

amin\\dyu(-,y)\\y  <  J  a(x,y)\VdyU(x,y)\2dx 
D 

=  —  ^2  uj  f  ^j{x)Xdy  £ju(x,  y)X7dyii(x,  y)dx 

{j-.Uj^O}  D 


<  Y  UMj\\L™(D)\\dyU{-iy)\\V\\dy~ 

{j-.Uj^O} 


1 u( ■ 


in  )  ii  V) 


Vjbj\\dy  ju(-,y)\\v. 


(4.11) 


and  therefore 

\\ByU(-,y)\\v  <  Y 

{w*  0} 

Using  (4.11),  we  now  prove  (4.9)  by  induction  on  \v\.  For  \v\  =  0  this  bound  is  simply 
\\u(-,y)\\v  <  B  with  B  as  in  (1.4)  which  is  known  from  (4.2).  For  \v\  >  0,  we  combine  (4.11) 
with  the  induction  hypothesis.  This  yields 

K<;v)\\v  <B  Y  "iMM-i W~e‘  =  B(  Y  ^)(H  - 1)»"  = 

{j:  Vj± 0}  {j:  Vj+ 0} 

which  concludes  the  proof.  o 


5  Galerkin  approximation 

In  this  section,  we  shall  introduce  a  numerical  approach  for  the  computation  of  u(x,y).  We 
assume  that  we  have  full  knowledge  of  dp  (which  may  or  may  not  be  the  case  in  a  given  applica¬ 
tion).  Obviously  u  belongs  to  L2(U,V,  dp)  (since  ||w(-,y)||y  is  uniformly  bounded  with  respect 
to  y  €  U)  and  it  can  be  defined  as  the  unique  solution  of  the  variational  problem: 


Find  u  €E  L2(U,  V,  dp)  such  that  B(u,v)  =  F(v)  Vu  €  L2(U,  V,  dp),  (5-1) 


where 


B(u,v)  :=  J  (J  a(x,y)Vu(x,y)  •  Vv(x,y)dx)dp{y)  and  F(v)  :=  J  (|  f{x)v{x,y)dx)dp{y). 


U  D  U  D 

For  any  subset  A  C  T  of  finite  cardinality,  we  define  the  approximation  space 
X\  :=  {uA(a :,y)  =  Yv’'(x)Lv(y)  1  vveV}, 


(5.2) 
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where  {Lv}u&jr  is  the  basis  of  Legendre  polynomials.  Note  that  X\  C  L°°(U ,  V)  C  L2{U ,  V,  dp). 
We  define  the  Galerkin  approximation  u\  =  XXeA  '“t'-k/  £  X\  to  u  as  the  unique  solution  to 
the  problem:  find 

£  X\  such  that  B(u\,  v\)  =  F(v\)  Vua  £  X\.  (5.3) 

Just  as  for  the  MC  method,  the  evaluation  of  u\  requires  the  computation  of  N  deterministic 
functions  uv  where  N  :=  ff(A),  and  the  computation  of  these  functions  requires  in  addition 
some  spatial  discretization.  We  postpone  discussion  of  the  spatial  discretization  until  §6  and 
first  focus  our  analysis  on  the  discretization  in  the  y  variable.  Namely,  we  search  for  appropriate 
choices  of  A  with  the  goal  of  obtaining  error  estimates  of  the  form 

Ik  -  || L2{u,v,dp)  ^  CN  r,  (5.4) 

for  the  largest  possible  r  >  0. 

Remark  5.1  We  can  derive  from  u\  an  approximation  to  the  mean  field  u  =  E(u)  which  are 
given  by 

ua  =  E(ua)  =  ^  euu„,  (5.5) 

l/£  A 

with  the  u-th  moments  ev  :=  E (Lu(y))  =  f  Lu(y)dp(y)  (although  the  mean  here  is  taken  with 

u  __ 

respect  to  y  and  the  measure  dp  it  is  easily  seen  that  it  results  in  the  same  means  u  as  averaging 
with  respect  to  u)  and  P)  .  By  the  triangle  and  Cauchy- Schwarz  inequalities, 

\\u-ua\\v<  J  \\u(-,y)  -  UA(-,y)\\vdp(y)  <  \\u- UA\\L2(Uy4p).  (5.6) 

u 

Therefore  the  rate  of  the  spectral  Galerkin  approximation  (5.3)  will  outperform  the  rate  (1.7)  of 
the  MC  estimate  (1.6)  for  the  mean  field  E(u)  in  terms  of  the  number  N  of  coefficients  in  V  to 
be  determined,  if  r  >  }2  in  (5.4). 

Remark  5.2  Our  approach  implicitly  assumes  that  we  have  the  full  knowledge  of  p  or  equiva¬ 
lently  of  P,  in  contrast  to  the  MC  method  which  only  needs  a  sample  of  independent  realizations. 
In  the  case  where  we  only  have  such  a  sample  (y1,  ■  ■  ■  ,yM)  £  UM  at  our  disposal,  we  can  adapt 
our  approach  by  solving 

Bm(ua,va)  =  Fm(v  a),  (5.7) 

in  place  of  (5.3)  ,  where  Bm  and  Fm  are  defined  by  replacing  the  integrals  of  the  type  f  f{y)dp(y) 
in  (5.2)  by  their  empirical  counterpart  YliL i  f(y '*)■  We  shall  not  embark  in  the  error  analysis 
of  this  variant  and  proceed  with  the  assumption  that  p  is  known  to  us. 

Remark  5.3  An  alternative  to  Galerkin  discretization  would  have  been  to  start  from  an  or¬ 
thonormal  basis  of  L2(U,dp)  instead  of  L2(U,dp).  However  such  a  basis  is  not  always  simple  to 
construct  when  p  is  not  separable  and  therefore  we  maintain  the  choice  of  the  Legendre  polyno¬ 
mials  even  when  p  differs  from  p.  As  we  shall  see  later,  sharper  error  estimates  can  be  obtained 
in  the  particular  case  where  p  =  p,  i.e.  when  the  random  variables  yj(uj )  are  independent  and 
uniformly  distributed  on  [—1,1]. 
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Remark  5.4  An  alternate  approach  to  Galerkin  discretization  is  collocation:  find  u\  G  X\ 
such  that 

I  a(x,y)'Vu\(x,y)'Vv(x)dx  =  j  f(x)v(x)dx,  (5.8) 

D  D 

for  all  v  €  V  and  for  all  y  G  5a  where  S\  C  U  is  a  set  such  that  ff(S\)  =  N.  This  approach  is 
however  more  difficult  to  analyze,  since  its  well-posedness  is  strongly  tied  to  an  optimal  choice 
of  S\. 


We  begin  our  error  analysis  by  observing  that  according  to  Cea’s  lemma  (see  e.g.  [3]),  we 
have  the  estimate 

(5.9) 


\u  ~  || L2(U,V,dp)  <  Cl  \\u  ~  wa|| L2(U,V,dp)i 


where  C\  :=  /“bus.  In  order  to  proceed  further,  we  introduce  the  exact  expansion  of  u  in  the 

\I  ^min 

basis  (Lu)u£fr  (see  (3.7)): 

u(x,y)  =  ^  cv(x)Lv(y)  where  cv  :=  f  u(-,y)Lv(y)dp(y)  G  V. 


vGT 


u 


We  infer  from  (5.9)  that 


\u  -  “A  II L2(U,V,dp)  <  Cl  || U  —  ^2  cvLu\\l2(U,V4p)- 

uGA 


(5.10) 


The  right  hand  side  of  (5.10)  can  be  bounded  in  different  ways  depending  on  the  properties  of 
p  with  respect  to  p. 

•  Case  1:  if  p  =  p,  we  can  invoke  Parseval’s  equality  which  yields 


\u~  J2c^hHu,v,dp)  —  \^2  WCvWv 

vgA  v^A 


i 

2  1  2 


,2  \  2 


and  therefore 

\\u  —  uA\\L2(Uy,dp)  <  Ci(x:  \\Cv\\v 

v<£A 

We  also  reach  (5.11)  up  to  a  change  in  the  constant  C\  if  dp  =  wdp  with  w  G  L°°(U). 
Case  2:  in  the  case  of  general  p,  we  can  still  write 

\\u~  yi  CyLu\\ LHU,V,dp)  <  ll^-  CvLv\\LOO(uy), 

VGA  i/GA 


(5.11) 


so  that  by  triangle  inequality,  we  obtain 


u  - 


ua\\l2(U,V4p)  —  Cl  ^2  1 1 1 1 1 1  1 1  (C/)  - 

ufA 


(5.12) 
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The  estimates  (5.11)  and  (5.12)  suggest  to  choose  for  A  the  sets  of  indices  v  corresponding 
respectively  to  the  N  largest  values  of  ||c^||y  and  ||cj,|| v- 1| ||i,°°(cr) -  Of  course  the  exact  value 
of  |c„||y  is  unknown,  and  therefore  a  more  reasonable  objective  is  to  build  A  based  on  a-priori 
bounds  for  ||cj,||y.  We  shall  derive  such  bounds  in  the  next  section.  The  process  of  approximating 
a  sequence  by  retaining  its  N  largest  terms  is  a  simple  instance  of  nonlinear  approximation  (see 
[7]  for  a  general  survey)  known  as  best  N -term  approximation.  The  rate  of  convergence  of  this 
process  is  well  understood,  thanks  to  the  following  result  of  Stechkin  whose  proof  is  elementary 
(e-g.  [7]). 


Lemma  5.5  Let  0  <  p  <  q  and  a  =  (au)u£^  be  a  sequence  in  £P(£F).  If  IFn  is  the  set  of  indices 
corresponding  to  the  N  largest  values  of\au\,  we  have 

(  X/  —  IMI ep(Jr)N~ri 

v^Tn 


where  r  0. 

p  <?  — 


The  a-priori  bounds  that  we  shall  obtain  in  the  next  section  will  also  be  used  to  analyze  the 
summability  of  the  sequence  (||cy||y)  in  £2  and  of  (||cj,||y  H-Lj/lh00!!/))  in  l1,  and  therefore  derive 
the  rate  of  convergence  r  >  0  in  (5.4)  based  on  the  estimates  (5.11)-(5.12)  combined  with  the 
above  lemma. 


6  The  decay  of  the  Legendre  coefficients  of  u 


The  decay  of  the  Legendre  coefficients  of  a  function  depends  on  its  smoothness.  For  example, 
one  simple  way  to  relate  cv  to  dyU  is  through  Rodrigues’  formula  which  reads 


Ln(t )  = 


(— l)V2n  +  l  (  d 


2  nn\ 


when  the  Legendre  polynomials  are  normalized  according  to  (3.4).  For  a  function  f(t)  of  one 

variable  which  is  n-times  continuously  differentiable,  we  can  apply  n  integrations  by  parts  and 

l 

obtain  a  bound  for  the  coefficient  cn  :=  J  f(t)Ln(t)f^\ 

-l 


C-n 


ft  1  +2 

2  nn\  7  1  ’  1  2 


< 


2nn! 


||/(n)| 


with 

l 

In  :=  s/2ff+l  /(l-  t2)n 
-l 

The  sequence  In  is  uniformly  bounded.  However,  it  will  be  sufficient  for  us  to  employ  the  crude 
bound  In  <  If  with  I\  =  |\/3  ~  1.155,  and  therefore  obtain 

on 

Kl  £  Or ll/(”)IU»<[-i,i]). 


dt 


V2^Ti]J 


k=  1 


2k 

2k  +  1 


if  n  >  1,  Iq  :=  1. 
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with 


(3:=  jj/2  =  1/V3  w  0.577.  (6.1) 

This  fixes  (3  for  the  remainder  of  this  paper.  Applying  similar  arguments  to  u(x,  y )  in  each 
variable  yj  yields 

g\v\ 

\\cv\\v  <  -^\\dyu\\Loo(jjiVy  (6.2) 

We  estimate  the  quantities  ||ci/||v'||LI/||x,oom)  in  a  similar  way,  replacing  In  by  Jn  =  \/2n  +  1  In 
and  using  the  crude  bound  Jn  <  2n.  This  leads  to 

\\cv\\v \\Lu\\l°°(u)  <  -^\\dyu\\L°°{u,V)-  (6-3) 


Combining  (4.9)  with  (6.2)  and  (6.3),  we  obtain  the  following. 

Corollary  6.1  Let  b  =  (bj)j> i  and  d  =  (dj)j> \  be  defined  by  bj  :=  and  dj  =  (3bj.  We 

then  have  with  B  as  in  (1.4)  for  all  v  e  T 


and 


\cu\\v  < 

u\ 


(6.4) 


\v\\  - 


\cv\\v\\Lv\\loo{u)<B'-±.V'. 


(6.5) 


7  Sequence  approximation 


As  explained  at  the  end  of  §5,  the  rate  of  convergence  N~r  of  the  spectral  approach  based  on  the 
optimal  choice  of  A  is  related  to  the  properties  of  l ,p  summability  of  the  multi-indexed  sequences 
(Ilc^Hy)^^  and  (Hc^Hy \\Lv\\l°°(u))v&f-  In  view  of  the  estimates  obtained  above  for  ||c;,||y  and 
||c!/||vr||Ly||i/oo(t/),  we  need  to  study  the  lp  summability  of  multi-indexed  sequences  which  have 
the  general  form 


( 


v\ 


-au)uep 


(7.1) 


where  a  =  {otj)j>i  is  a  sequence  of  positive  numbers.  Since  the  rate  is  either  given  by  r  =  I  —  \ 
or  r  =  I  —  1  (depending  on  the  relation  between  the  measure  p  and  the  uniform  measure  p)  and 
since  we  are  interested  in  understanding  under  which  circumstances  r  may  be  larger  than  we 
need  to  consider  values  of  p  smaller  than  1 . 

In  this  section,  we  establish  simple  necessary  and  sufficient  conditions  on  a  sequence  a  for 
the  £p  summability  of  the  multi-indexed  sequence  (7.1).  Our  first  result  deals  with  sequences 
which  have  the  simpler  form  (aI')I/ejF. 


Lemma  7.1  For  p  <  1,  the  sequence  (cd')I/ejF  belongs  to  £p(tF)  if  and  only  if  a  £  t?p(N)  and 
1 1  <a  1 1  ^oo  (pj)  <  1.  Under  these  conditions,  we  have 


a 


{aU)\Ur(T)  <  eXPi^ 


IP 

l£p(N) 


-  a 


:}• 


(7.2) 
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Proof:  Assume  first  that  (au)u^.jr  belongs  to  £p(tF).  By  remarking  that  for  v  =  ej  the  Kroeneker 
sequence  av  =  ctj,  we  find  that  a  G  fp(N)  with  ||a||^P(N)  <  ||(aI/)||£P(jp).  For  each  fixed  j,  the 
sequence  {av)v&jr  contains  (ct”)n> o  as  a  subsequence  corresponding  to  the  indices  v  =  nej  and 
hence  we  must  have  aj  <  1.  From  the  fact  that  a  G  fp(N),  we  see  that  ctj  — >  0  as  j  —*  +oo  and 
hence  ||q||^oo^\  <C  1. 

Conversely,  if  a  G  fp(N)  and  ||cr||^oo(pj)  <  1,  we  can  write 

= e<«t = e  n  »r = n  E«r = n  ^ 

u^J7  vEJ7  j>  1  j>ln>0  j>  1  3 


where  we  have  used  that  ctj  <  1  for  all  j  >  1.  In  order  to  prove  the  convergence  of  the  infinite 
product,  we  remark  that 

1  v  <  1  +  noE,  j  =  1,2, ...  where  k  :=  - 1 ^ - 

1  1  llall^°o(N) 

so  that  we  can  write 


log 


<  log(l  +  naPj)  <  K  ^2  aj  = 

j> i  j> i 


a 


£r(N) 


1  -  llollp 

1  llulb°°(N) 


which  implies  the  bound  (7.2). 


o 


We  next  turn  to  multi-indexed  sequences  which  have  the  form  (7.1). 


Theorem  7.2  For  p  <  1,  the  sequence  (^fa1')„6f  belongs  to  £p(tF)  if  and  only  if  ||a||^i(N)  <  1 
and  a  G  fp(N).  One  has  the  estimate 


,\v\\  .....  2  / 

)\\*W  ^  “exP( 


'2(1  -p)(J(v)  +  IMI^(N)) 


p2f] 


(7.3) 


where  77  :=  (1  —  ||a||^i(N))/2  and  J(rj)  is  the  smallest  positive  integer  such  that  Y!j>j  \aj\p  55  \- 


Proof:  First  notice  that  ||(^a1')  does  not  change  if  we  rearrange  the  entries  in  a  and 

therefore  we  may  without  loss  of  generality  assume  that  the  nonnegative  sequence  a  is  decreasing. 
For  p  =  1,  we  have 

OO  OO 

=  E<E“A  = 

k= 0  j= 1 

which  gives  the  theorem  in  this  case.  So  we  consider  further  only  the  case  p  <  1. 

Assuming  that  ( belongs  to  £p(tF),  we  notice  that  the  sequence  contains 

a  =  (ctj)j>  1  as  subsequence  corresponding  to  the  indices  v  =  e^,  and  therefore  the  iv  summability 
of  a  is  necessary.  On  the  other  hand,  since  p  <  1,  the  sequence  (^cd')i,ejF  belongs  to  tp{T) 
only  if  it  is  summable,  i.e.  it  belongs  to  £1(^r).  Hence,  (7.4)  gives  that  1 1 ck 1 1 ^1  <  1  is  necessary. 

Conversely,  let  a  be  a  sequence  such  that  a  G  fp(N)  and  ||a||^i/N\  <  1,  we  shall  construct  a 
factorization  of  a  as  aj  =  7 jSj,  j  =  0, 1, . . .,  with  sequences  7  and  5  satisfying 


a 


(7.4) 


Il7ll/1(N)  <  !>  II<J||/~(N)  <  !>  II^II^'(N)  <  °°>  p  :=p/(l -p)  >  0-  (7-5) 
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Having  such  a  factorization  of  a  at  hand,  we  estimate 


£( 

v&T 


u\\ 


-a 


V\P  — 


v&T 


< 


WW  ,,\P 


£^T  £* 


S^A1  i’ 

> i -p 


v&T 


-7 


v 

P-{T) 


nilp 

n\(pi  (jp) 


<  [i  -  IMI^cn)]  PexP 


(i- 


p 

tv'  (N) 


1  - 


||  P 

ih°°(N) 


(7.6) 


where  we  have  first  used  Holder’s  inequality  and  then  we  employed  (7.4)  on  the  first  term  and 
Lemma  7.1  (with  p'  in  place  of  p  and  5  in  place  of  a )  on  the  second. 

It  remains  to  construct  the  factor  sequences  5  and  7  satisfying  (7.5).  To  this  end,  we  observe 
that  for  every  7  >  0,  there  exists  J(rj)  such  that 


£  Mp< 


Choose 


V  ■=  (!  -  IMIa(N))/2- 

Then  0  <  7  <  1/2  and  we  define  the  factor  sequences  7  and  5  by 

1 


7 j  :=  (1  +  7 ])ctj  and  5j  :  = 


1  +  ?? 


3  <  J(v), 


and 


7 j  =  (Xj  and  Sj  =  p,  j  >  J(rj). 


Then  we  read  off  (7.7)  and  (7.8)  that 


|#»(N)  <  nrax{(l  +  rj)  1,  |ja||^N)}  <  max{(l  +  rj) 


\a\ 


A(N) 


}<1. 


We  also  have 


Il7ll/1(N)  <  (1  +  »7)II«II<1(N)  +  X/  lailP  ^  (!  +  r?)(1  _  2??)  +  x  <  1  —  x  <  1- 

Finally,  we  obtain  with  p'  :=  p/(  1  —  p) : 

||d'||p',(N)  <  J(7?)(l  +  rj)~p'  +  ||a||£,(N)  <  00. 

In  order  to  obtain  the  bound  (7.3),  we  use  (7.6)  which  states  that 

.(1  - 

^  I1  “  ll7ll/i(N)]_1exp/ 

From  (7.10),  we  find  that 

<7 

1-1 


p 

tv'  (N) 


p(i-IWl(N)) 


[i  -  II7||a(N)]  <  -• 


(7.7) 

(7.8) 

(7.9) 

(7.10) 


(7.11) 


(7.12) 


(7.13) 
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From  (7.9),  we  infer 

lklU°°(N)  <  nrax{(l  +  77)  (1  -  2 rj)1~p}  <  max{l  -  1  -  2(1  -  p)rj}  <  1  -  (1  -p)|, 

where  we  have  used  the  fact  that  rj  <  ^  and  therefore 

1  -  ||^||?oo(N)  >p'(l  -P)|  =  \o-  (7-14) 

From  (7.11),  we  infer 

ll<yll^(N)  <  •JfaX1  +  7))~p'  +  l|a||^(N)  <  J(rj)  +  IMI£.(N)-  (7-15) 

Inserting  the  estimates  (7.13),  (7.15)  and  (7.14)  inside  (7.12)  we  obtain  (7.3).  o 

We  now  combine  the  above  result  with  the  estimates  (6.4)  and  (6.5),  taking  as  a  the  sequences 
b  and  d.  Note  that  according  to  (2.4),  these  sequences  respectively  satisfy  ]|6||^i  <  n  and 
||d|k  <  (3k  under  Assumption  2.  We  thus  obtain  the  following. 

Corollary  7.3  Assume  that  Assumption  3  holds  for  some  p  <  1. 

(i)  If  moreover  Assumption  2  holds  with  n  :=  4,  then  (||cI/||v’)ve^  €  £P(F) 

(ii)  If  moreover  Assumption  2  holds  with  k  :=  1,  then  (||cI/||v'||LI/||x(oo(m)I/e^  g  IP(F). 

Combining  Corollary  7.3  with  (5.11)  and  (5.12)  and  using  Lemma  5.5,  we  obtain  the  following 
error  estimates  between  u  and  u\. 

Corollary  7.4  Assume  that  Assumption  3  holds  for  some  p  <  1. 

(i)  If  dp  =  wdp  with  w  G  L°°(U)  and  if  Assumption  2  holds  with  k  :=  4,  then  there  exists  a 
sequence  (A n)ngn  C  IF  of  index  sets  A  of  cardinality  N  =  1,2, ...  such  that 

Ik  -  UAN\\L2(uy,dP)  <  CN~r,  r  = - -.  (7.16) 

P  ^ 

(ii)  For  a  general  p,  if  Assumption  2  holds  with  k  :=  1,  then  there  exists  a  sequence  (AAr)jvepj  C  IF 
of  index  sets  Ajy  of  cardinality  N  =  1,2, ...  such  that 

Ik  -  wAjv \\L2(uy,dp)  <  CN  r,  r  =  -  -  1.  (7.17) 

Remark  7.5  In  the  case  when  p  =  wdp  with  w  G  L°° ,  we  always  have  r  >  ^  and  therefore  the 
MC  rate  (1.7)  is  outperformed  as  soon  as  the  ifj  satisfy  the  required  summability  condition. 

Remark  7.6  In  the  case  where  {'fj)j>\  is  the  Karhunen-Loeve  expansion,  decay  estimates  on 
\\iftj  ||l°°  ensuring  its  lp -summability  are  available.  These  estimates  depend  on  the  smoothness 
properties  of  the  covariance  function  Ca(x,y),  see  [20,  18]. 
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bounds  (7.16)  and  (7.17).  The  bound  (7.3)  on 


Remark  7.7  It  is  obviously  interesting  to  estimate  the  size  of  the  constants  C  in  the  error 

^raT )  obtained  in  Theorem  7.2  allows 

'■  )  lv(T) 

us,  via  Lemma  5.5,  to  estimate  C  in  (7.16)  and  (7.17)  in  terms  of  the  summability  properties 
of  the  Karhunen-Loeve  expansion  (1.8)  of  the  input  data.  However,  this  bound  is  not  easily 
computable  since  it  involves  the  quantity  J(rj)  which  might  actually  be  arbitrarily  large  under  no 
assumption  other  than  a  £  £P(N).  More  can  be  said  under  the  (slightly)  stronger  assumption 
that  a  £  £q(N)  for  some  q  <  p.  Assuming  without  loss  of  generality  that  a  is  non-increasing,  we 
find  from  Lemma  5. 5  that 

"  lp  < 


\  ^ 


j>J 


\a 


3 1  — 


I  IIP  T  — 

lQIU(Nr  9  ' 


We  therefore  find  that 


J(v)  <  ( 


V  \  — - — 

'  q-p  . 


2||a| 


^(N) 


which  leads  to  the  computable  bound 


Av\\  2  / 

)H^)  ^  ~exP{ 


4(1  —  p)(( 


2||al|P 


q 

I  q-p 


iq(  n) 


p2r] 


a 


(7.18) 


8  Space  discretization 

Up  to  this  stage,  our  results  allow  us  to  draw  a  comparison  between  the  convergence  rate  of  MC 
and  deterministic  methods  in  terms  of  the  number  N  of  deterministic  unknown  functions  which 
need  to  be  determined  in  such  methods.  The  actual  computation  of  these  unknown  functions 
involves  space  discretization,  which  is  the  source  of  additional  approximation  error. 

The  purpose  of  this  last  section  is  to  analyze  these  aspects  in  order  to  draw  a  more  exact 
comparison  between  the  convergence  rate  of  the  two  methods,  now  expressed  in  terms  of  the 
total  number  of  degrees  of  freedom  N^of ■  For  the  sake  of  simplicity,  we  shall  focus  on  space 
discretizations  by  the  finite  element  method,  although  our  discussion  can  be  extended  to  other 
types  of  discretization. 

In  order  to  establish  convergence  rates  in  the  above  sense,  we  need  to  give  regularity  estimates 
of  the  solution  u  in  the  physical  domain.  While  this  is  classical  for  linear,  elliptic  equations,  we 
require  a-priori  estimates  uniform  in  the  parameters  y  £  U. 

For  this  purpose,  additional  assumptions  are  needed.  We  first  recall  that  when  the  domain 
D  is  either  a  smooth  domain  or  a  convex  polyhedron  with  straight  faces,  the  solution  v  to  the 
Laplace  equation 

-Av  =  f  in  D,  v\dD  =  0,  (8.1) 

with  /  £  L2(D)  belongs  to  Hq(D)  (~l  H2(D).  This  is  well-known  to  yield  a  convergence  rate 
for  the  finite  element  method  on  families  of  shape-regular,  quasiuniform  meshes  of  meshwidth 
h:  if  (Vh)h> o  is  a  one  parameter  family  of  finite  element  spaces  associated  to  a  family  of  shape- 
regular  and  quasiuniform  partitions  of  D  into  simplices  of  meshwidth  h  >  0,  we  have  the  standard 
approximation  estimate 

inf  ||u  -  vh\\v  <  Ch\\v\\H2{D),  (8.2) 
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i.e.  convergence  rate  C?(M-3)  with  M  :=  dim(V)j)  ~  h~d.  The  I42-smoothness  estimate  is 
lost  when  working  on  non-convex  polyhedrons,  however  it  is  well  known  that  the  rate  M~d 
may  sometimes  be  retained  by  using  continuous,  piecewise  linear  finite  elements  on  certain 
nonuniform  meshes. 

This  is  in  particular  the  case  on  2-d  polygonal  domains  with  reentrant  corners.  In  order  to 
include  this  case  in  our  analysis,  we  introduce  the  following  general  assumption. 

Assumption  4.  The  domain  D  is  such  that  the  subspace 

W  :=  {v  G  V  ;  Av  €  L2(T>)}, 

equipped  with  the  norm  ||v||w  :=  ||Av||L2  has  the  approximation  property 

inf  ||i>  —  vm\\v  <  CaM~s\\v\\w,  (8.3) 

for  some  s  >  0,  where  (Vm)m> o  is  a  family  of  finite  element  spaces  such  that  dim(V^f)  <  M. 

Under  Assumption  4,  the  finite  element  approximation  um  of  the  solution  u  of  (8.1)  satisfies 

Ijit  -  um \\v  <  CaM~$^ | / 1 1 l2 (d)  (8-4) 

When  D  is  smooth  or  convex,  the  space  W  coincides  with  H2(D)  fl  Hq(D),  and  s  =  \  when 
Vm  is  chosen  as  space  of  continuous,  piecewise  linear  finite  elements  on  a  family  of  regular, 
quasiuniform  simplicial  meshes. 

In  order  to  establish  similar  approximation  estimates  on  the  solution  of  the  problem  (1.1) 
with  spatially  inhomogeneous  random  coefficients,  we  need  a  smoothness  assumption  on  these 
coefficients. 


Assumption  5.  There  exists  a  constant  Cr  >  0  such  that 


II^®IIl°0(!7,I/00(D))  sup  ||Va(-,y)||£oo(£>)  <  Cramin. 

y£U 


(8.5) 


Since  (1.1)  can  be  rewritten 


—A u  =  -[/  +  Va  •  Vm]  =:  g , 

we  can  estimate 

Mil-  <  4-[II/Md)  +OII/IM  <  1  +  CrCf  11/11^(0),  (8.6) 

O'  min  ^min 

where  Cp  denotes  the  Poincare  constant  of  D.  We  thus  obtain  from  Assumption  5  a  smoothness 
estimate  for  the  solution  of  (1.1): 


i  ii  ^  n  1  +  CrCp 

\u\\L°°(y,w)  A  U2  - \\J\\l2{d)- 

CLrr\\x\ 


(8.7) 
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For  comparison  purposes,  we  now  establish  a  convergence  estimate  for  the  MC  methods  with 
space  discretization.  Let  M  be  fixed  and  for  each  instance  at,  let  u^m  £  Vm  be  the  Galerkin 
projection  of  U{  onto  Vm  which  is  dehned  by 


find  uitM  e  VM  ■ 


j  ",V iii. mV r.y 


D 


J  fvM  for  all  vM  G  VM- 
d 


We  define  the  corresponding  approximation  to  the  mean  field  by 


un,m  '■= 


1 

N 


N 

Yuj,M. 

1=1 


Combining  our  assumptions  with  Cea’s  lemma  yields 


Ui  —  «i,Af||v  <  C\  inf  \\ui  —  umIIv  <  CaC\M  s\\ui\\w  <  CaC\C2M  s, 
vm&Vm 


for  each  i  =  1,  •  •  • ,  N,  with  C\ 


We  therefore  have 


1 


N 


<  ~Y \\iu- 

N  ^ 

1=1 


| un  -  un,m\\v  S  >  || iH  —  u^m\\v  <  CaC\CiM 


Combining  this  with  (1.7),  we  obtain 

E(||«  -  un,m\W)  <  BN -3  +  CaCxCiM-s. 


(8.8) 


The  total  number  of  degrees  of  freedom  appearing  in  Monte-Carlo  Finite  Element  simulation 
with  N  “samples”  is  N^of  :  =  NM.  To  optimize  estimate  (8.8)  with  respect  to  a  given  total 
number  of  degrees  of  freedom  N^of  '■=  NM,  we  take  N  ~  M2s.  Then  N^of  ~  M2s+l  and  we 
obtain  the  error  estimate 

E(||u  -  un,m\\v)  <  CN~oY~\  (8.9) 

where  the  constant  C  depends  on  B,  Ca,  Cj  and  C?. 

We  next  turn  to  the  deterministic  method.  We  incorporate  the  space  discretization  as  follows: 
for  any  subset  A  C  F  of  finite  cardinality  and  any  vector  M.  =  (Mu)u£\  of  positive  integers,  we 
define  the  approximation  space 

Xa,m  ■=  {vam(xiV)  =  ^MaO-Ms/)  ;  w  €  VMv}- 

A 

We  define  the  corresponding  Galerkin  approximation  u\tM  =  X^eA  uv,M  to  u  as  the 

unique  solution  to 

B(ua,m,vA,m)  =  F(vA,M ),  (8.10) 

for  all  va,m  £  Xa,m->  where  B  and  F  are  defined  by  (5.2).  The  total  number  of  degrees  of 
freedom  is  now  given  by 

Ndof  =  Yj  ^l" 

u£  A 
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We  first  mimic  the  analysis  of  the  Galerkin  approximation  in  §3:  from  Cea’s  lemma,  we  get 

\\u  -  u^m\\l2(u,V4p)  —  C'i\\u  ~  cu,MvLu\\L2^uVdp),  (8.11) 

VGA 

for  any  cV)mv  £  Vmv-  Specifically,  we  take  cv>mv  to  be  the  U-orthogonal  projection  of  the 
Legendre  coefficient  cv  onto  Vjw  .  Similar  to  the  discussion  in  §3,  we  distinguish  two  cases: 

•  Case  1:  if  dp  =  dp,  we  obtain  by  orthogonality 

11^  ~  uA,M\\L2(u,v,dp)  E  \\cv\\v  +  \\cv  ~  evilly)  ■  (8-12) 

v^A  vgA 

We  also  reach  (8.12)  up  to  a  change  in  the  constant  C\  if  dp  =  wd/i  with  w  G  L°°(U). 

•  Case  2:  in  the  case  of  general  p,  we  obtain  by  the  triangle  inequality 

\\u~  ua,m\\l2(u,V4p)  —  Ci  (E  IM|v||-MIl°°(;7)  +  ^2  || cv  -  || v \\Lv\\l°°(u))-  (8.13) 

i/fA  ugA 

The  right  hand  side  of  the  estimates  (8.12)  and  (8.13)  are  similar  to  (5.11)  and  (5.12)  up  to 
an  additional  term  reflecting  space  discretization.  From  (8.3),  we  obtain 

\\cv  —  c^mJIv  <  CaM~s\\cv\\w ■ 

Under  the  assumptions  of  Corollary  7.4,  we  thus  obtain  from  (8.12)  in  the  first  case 

\\u  -  u\tM\\L2(uy,dp)  —  c(N~2r  +  yM-2s\\cv\\2w)  ,  (8.14) 

VGA 

where  N  :=  #(A)  and  from  (8.13)  in  the  second  case 

\\u  ~  ua,m\\l2(u,V4p)  —  c(^N  '  +  ^2  6||cJ/||w"||L1/||Loo(j7)^  .  (8.15) 

vgA 

Based  on  these  estimates,  we  optimize  the  discretization  parameter  At  =  in  order  to 

estimate  the  best  possible  convergence  rate  for  the  deterministic  method  in  terms  of  the  total 
number  of  degrees  of  freedom.  The  optimization  problem  to  be  solved  consists  in  minimizing  the 
number  of  degrees  of  freedom  under  the  constraint  that  the  additional  term  reflecting  space  dis¬ 
cretization  remains  of  the  same  order  as  the  first  term  reflecting  discretization  in  the  y  variable, 
i.e. 

Min{^  Mv  :  M;2s\\cv\\2w  <  N~2r},  (8.16) 

uG A  uGA 

in  the  first  case  and 

Min {^2MV  :  ^2m-s\\c4w\\l4l^(u)  <  N~r},  (8.17) 

uG A  uG A 

in  the  second  case.  We  solve  both  problems  by  treating  the  Mu  as  continuous  variables,  up  to 
finally  taking  the  integer  value  of  the  solution.  For  (8.16),  introducing  a  Lagrange  multiplier, 
we  obtain 

W  G  A 
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where  the  value  of  A  is  given  by 


N-2r  =  YJM~2s\\cv\\2w  =  A-^Mv  =  A~&  J2  IMI w23- 

VGA  vGA 

Two  situations  may  occur  depending  on  the  summability  properties  of  the  sequence  (HcvIIwO^jf: 

•  If  (llc^llw)^^7  £  £p(lF)  with  p  =  j^2^,  we  obtain  that  AT  1+2s  ~  N~2r.  It  follows  that 

Ndof  =  ^2  Mv  =  AN~2r  ~  ~  Nzs . 

vGA 

We  therefore  obtain  the  convergence  rate 

llM  ~  uA,M\\L2{uy,dP)  <  CNdoSf.  (8.18) 

•  If  (||cI/||w)i/e.F  £  lv{T)  for  some  p  >  jqq^q,  we  can  estimate  A  by  using  Holder’s  inequality 
as  follows: 

A^N-2r  =  J2  \K\lW8  <  (X)  =  CNS, 

ugA  uG  a 


with  5  :=  1 


2 

p+2sp 


>  0.  This  leads  to 


Ndof  =  J2m-  =  AN~2r  ~ 

vgA 


(2r+S)(l+2s)  „ 

N  2s 


2r+S(l+2s) 

=  N  2s 


We  therefore  obtain  the  convergence  rate 

ll«  _  uAM\\L2{uy,dP)  <  CNdo^r+5(1+2s) 


(8.19) 


Remark  8.1  The  first  estimate  (8.18)  shows  that  if  the  sequence  (||cj/||w)i/e^r  is  sufficiently 
concentrated,  the  rate  of  convergence  of  our  method  is  similar  to  solving  one  single  deterministic 
problem  and  therefore  optimally  fast.  On  the  other  hand,  since  we  have  by  ParsevaVs  equality 


\\cAw  ~  \\u\\2L2(u,w,dp,)  — 

vGT 


u 


<  O  '2 
’( U,W )  -  °2 


with  C*2  as  in  (8.7),  we  are  always  ensured  that  (||c„|| w)vgF  £  £2(iF).  -In  the  worst  case  p  =  2, 

o  _  2sr 

the  rate  of  convergence  is  given  by  the  second  estimate  (8.19)  with  6  =  therefore  N  2r+2s 

which  is  still  faster  than  the  MC  rate  (8.9)  if  r  >  \,  since  2r+os  ~  1X2 7  >  0  then. 


By  applying  a  similar  analysis  to  the  optimization  problem  (8.17)  we  obtain  that  (8.18)  holds 
provided  that  (\\cu \\w\\La’\\Loo(u))i'GJr  £  £p(iF)  with  p  =  <1.  If  this  sequence  belongs  to 

£p(fF)  for  some  p  >  jW,  we  obtain  the  final  error  estimate 


u 


-  «a|| L2{U,V,dp)  <  CN( 


r+<5(l+s) 

dof 


(8.20) 


with  5  :=  1  — 


p+sp 


>  0. 


In  view  of  these  results,  our  last  task  is  therefore  to  analyze  the  H’-summability  properties  of 
the  sequences  (||c„|| w)vgT  and  (||cI/||w||-^i/||L°°(m)^e^r-  We  proceed  in  a  similar  way  as  for  the 
sequences  (||c„|| v)vgF  and  ( 1 1 1 1 v" 1 1 1 1 (17) ) j  estimating  first  the  derivatives  \\dy'u\\Laa(jj^Wy 
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Theorem  8.2  Let  the  sequence  b(s)  =  (bj(e))j> i  be  defined  by 


bj(e)  bj  +  e(||  ||l°°(D)  +  Clr||V,jllLoo(D))i 

where  Cr  is  the  constant  in  Assumption  5,  bj  :=  j  =  i;2, ...  and  e  >  0  is  arbitrary. 

J  amin 

We  then  have 

\\dyu\\L°°(u,W)  —  Cz\v\ }-b{£Y  (8-21) 

for  all  u  <E  T,  where  C3  :=  ( B  +  \\f\\Lz(D))  ■ 

Proof:  for  fixed  y  £  U,  v  G  T  we  introduce  the  notation  :=  V  •  (a(x,y)VdyU(x,y)),  and 

remark  that  the  function  dyu(x,  y )  is  the  solution  to  the  elliptic  problem 

-V  •  (a(x,y)'VdyU(x,y))  =  -vv(x)  in  T>,  y)|aD  =  0.  (8.22) 

Using  the  regularity  estimate  (8.7),  we  obtain  that 

\\dyu(-,y)\\w  <  1  +  CrCp\\vu\\LfiD)-  (8-23) 

flmin 

We  now  estimate  ||ui,||.l2(.d)-  To  this  end,  we  start  from  the  identity  (4.10):  for  any  y  G  U  and 
any  v  6  V 


j  V-(a(x,  y)VdyU(x,  y))v{x)dx+  X  Uj  [Vfij(x)-'Vdy  Cj  u(x,y)+ifj(x)Ady  eju(x,  y)\v{x)dx  =  0. 

d  1  r-  Ki#o}  1 

Taking  here  v  =  tv  G  U  and  using  the  Cauchy- Schwarz  inequality  we  obtain 

\K\\l^d)  <  X]  ^(\\v^jh^(D)\\9y~e:,u(-,y))\\v  +  \\i>j\\L™(D)\\^dy~e3u{-,y))\\L2{D^). 

{j:  Uj^ 0} 

(8.24) 

We  next  observe  that  it  follows  from  (8.22)  and  (8.6)  with  (8.5)  in  Assumption  5  that  for  any 

yeU 

\\A(dy~eju(-,y))\\L2{D)  <  —  ||w-ej  (',  2/)  ||l2(d)  +  Cr\\dy~ej u(- ,  y))\\v ■ 

ttmin 

Inserting  this  in  (8.24)  implies 

IKHl2(d)  <  X  ^((ll^^'lli°°(o)+C,r||'0j||L°°(D))ll^j/  2/))l|y+^illW-ej  IIl2(d))  (8.25) 

U  - 

with  bj  as  in  (4.8).  Multiplying  (8.25)  by  e  >  0  and  adding  it  to  the  estimate  (4.11)  established 
in  the  proof  of  Theorem  4.3,  we  obtain 

\\dyu(-,y)\\v  +  £\K\\L2{d)  <  X  Vjbj(\\dreJu(->y)\\v  +  zh’v-ejhfiD)) 

{j:  Vj^ 0} 

+  X  z/1£(II^V’jIIloo(d)  +  Cr\\fi’j\\L°°{D))\\dy  Ju(-,y))\\v, 

{j:  0} 

and  therefore 


\\dyu(-,y)\\v +£\\vu\\L2{D)  <  X  "jbj(£)(\\dy  eJu(-iy)\W +4vv-ej\\LfiD))-  (8-26) 

(j:  y,A0} 
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Using  the  same  reasoning  by  induction  as  in  the  proof  of  Theorem  4.3,  we  infer  from  (8.26)  that 

\\dyu(-,y) \\v  +  £\\vv\\l*{d)  <  (\\u(-,y)\\v  +  e\\V  ■  (a(x,y)Vu(x,y))||L2(D))|i/|!6(£)I/ 

<(B  +  \\f\\LHD))\v\\b(ey 

for  all  v  £  T.  We  thus  have 

llwll l2(d)  <  ~{B  +  ||/||l2(d))  1^1 

which,  together  with  (8.23),  concludes  the  proof.  o 

We  next  proceed  similar  to  the  study  of  the  summability  of  (||cI/||y)i/e.F  and  (||c„||  v\\Lv\\L<x>(u))veT 
For  this  purpose,  we  introduce  the  following  analog  to  Assumption  3  for  the  functions  VV’j- 

Assumption  6.  The  sequence  dlVVy  ||l°°(d)).7>i  belongs  to  /P(N)  for  some  p  <  1: 

^  llz,oo(x?)  <  00 

3>  1 

Using  the  fact  that  e  can  be  chosen  arbitrarily  small  in  the  statement  of  Theorem  8.2,  we  reach 
the  following  analog  to  Corollary  7.3. 

Corollary  8.3  Assume  that  Assumptions  4  and  5  hold,  and  that  Assumptions  3  and  6  hold  with 
the  same  p  <  1. 

(i)  //Assumption  2  holds  with  k  :=  4,  then  (||cJ,||iy)i'e.7:'  £  lp(tF)  and  we  obtain  the  error  bound 
(8.18)  ifp  =  2^i  and  (8.19)  if  p  >  ^py. 

(ii)  //Assumption  2  holds  with  k  :=  1,  then  ( 1 1 1 1 yy 1 1 1 1 (c/) ) and  we  obtain  the 
error  bound  (8.20). 

9  Conclusion  and  perspectives 

The  deterministic  approach  which  is  studied  in  this  paper  outperforms  the  Monte-Carlo  ap¬ 
proach  in  terms  of  convergence  rate,  provided  that  the  expansion  of  the  random  coefficient  a 
in  the  basis  i pj  has  some  summability  properties  in  the  L°°  norm.  Our  analysis  is  restricted 
to  random  coefficients  which  are  uniformly  elliptic  in  the  sense  of  Assumption  Al.  We  ex¬ 
pect  that  similar  conclusions  hold  in  different  settings,  in  particular  log-normal  coefficients,  i.e. 
a(x,uj)  :=  exp (b(x,uj))  where  b  is  a  Gaussian  random  field.  In  this  setting,  the  Legendre  poly¬ 
nomials  need  to  be  replaced  by  the  Hermite  polynomials  which  are  orthonormal  with  respect 
to  the  Gaussian  measure.  We  remark  that  the  analytic  regularity  results  Theorems  4.3  and  8.2 
were  obtained  by  real-variable  inductive  arguments.  A  different  avenue  of  their  proof  is  through 
techniques  of  several  complex  variables;  this  is  explored  in  the  forthcoming  work  [6]. 

This  paper  has  been  concerned  with  establishing  results  on  the  approximation  of  solutions  to 
stochastic  and  parametric  problems  by  finite  dimensional  adaptively  chosen  Galerkin  subspaces. 
Our  analysis  does  not  propose  a  specific  algorithm  for  identifying  these  subspaces.  Our  results 
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should  therefore  rather  be  considered  as  a  benchmark  for  the  convergence  analysis  of  numerical 
methods  for  the  approximation  of  parametric  and  stochastic  PDE’s  in  the  x  and  y  variables. 
The  two  most  commonly  used  numerical  methods  are  Galerkin  projections  (as  described  in  §5 
and  §8)  and  collocation  [2,  12,  13].  In  order  to  retrieve  the  same  convergence  rates  which  are 
proved  in  the  present  paper,  such  methods  need  to  be  developed  within  an  adaptive  framework, 
with  the  goal  of  selecting  proper  sets  An  and  finite  element  spaces  Vv  throughout  the  numerical 
computation.  This  will  be  a  subject  of  our  future  work  but  we  can  provide  some  preliminary 
comments  on  finding  good  Galerkin  subspaces. 

In  the  proofs  of  our  convergence  theorems,  we  establish  a-priori  estimates  for  the  Legendre 
coefficients  Ijc^Hy.  This  suggests  a  first  strategy  that  consists  in  choosing  A  by  selecting  the  N 
largest  ones  from  the  available  a-priori  estimates.  Another  approach  that  might  be  more  effective 
is  to  adaptively  build  A  through  a-posteriori  error  estimates.  This  means  that  we  start  from 
the  coarse  set  A0  =  {0}  and  recursively  construct  a  nested  sequence  A”  using  error  indicators. 
Such  space  refinement  strategies  have  been  explored  in  the  simpler  context  of  adaptive  wavelet 
approximation  of  the  solution  to  a  single  elliptic  PDE’s,  see  in  particular  [5,  16].  In  these  works, 
it  is  shown  that  a  standard  bulk  chasing  strategy  based  on  a-posteriori  error  indicators  leads  to 
optimal  convergence  rates.  The  adaptation  of  this  approach  to  the  parametric  and  stochastic 
PDE’s  addressed  in  this  paper  is  currently  under  investigation.  A  critical  issue  is  also  the  proper 
adaptive  tuning  of  the  space  discretization  with  respect  to  the  different  indices  v  as  revealed  by 
our  analysis  of  §8. 

Finally,  let  us  reiterate  that  an  intrinsic  weakness  in  the  deterministic  approach  is  that  it 
assumes  a  complete  knowledge  on  the  probability  distribution  of  the  coefficients,  while  Monte- 
Carlo  is  applicable  when  we  only  have  a  sample  of  independent  instances  at  our  disposal.  In 
such  a  case,  one  may  still  hope  to  construct  a  deterministic  solution  u\  €  Va,  either  by  the 
collocation  method  or  by  a  Galerkin  system  similar  to  (5.3)  in  which  the  integrals  over  U  with 
respect  to  the  unkown  measure  dp  are  replaced  by  computable  empirical  expectations  based  on 
the  available  samples. 
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